Overview

Dataset Statistics

Number of Variables 14
Number of Rows 10000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 2.1 MB
Average Row Size in Memory 217.0 B
Variable Types
  • Numerical: 6
  • Categorical: 8

Dataset Insights

  • UDI is uniformly distributed [Uniform]
  • Product ID has a high cardinality: 10000 distinct values [High Cardinality]
  • Product ID has constant length 6 [Constant Length]
  • Type has constant length 1 [Constant Length]
  • Machine failure has constant length 1 [Constant Length]
  • TWF has constant length 1 [Constant Length]
  • HDF has constant length 1 [Constant Length]
  • PWF has constant length 1 [Constant Length]
  • OSF has constant length 1 [Constant Length]
  • RNF has constant length 1 [Constant Length]
  • Product ID has all distinct values [Unique]

Variables


UDI

numerical

Approximate Distinct Count 10000
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 5000.5
Minimum 1
Maximum 10000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • UDI is uniformly distributed

Quantile Statistics

Minimum 1
5-th Percentile 500.95
Q1 2500.75
Median 5000.5
Q3 7500.25
95-th Percentile 9500.05
Maximum 10000
Range 9999
IQR 4999.5
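
The fractional quantiles above (500.95, 2500.75, and so on) are what linear interpolation between neighbouring order statistics gives for the integers 1 to 10000. The following is a minimal sketch under that assumption; the helper name `linear_quantile` is hypothetical and is not taken from the profiling tool.

```python
def linear_quantile(sorted_values, p):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = (len(sorted_values) - 1) * p          # fractional 0-based position
    lo = int(pos)                               # index at or just below that position
    hi = min(lo + 1, len(sorted_values) - 1)    # neighbour above, clamped to the end
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


# Assumption: UDI is the sequence 1..10000, as the minimum, maximum and
# distinct count in the report suggest.
udi = list(range(1, 10001))
quantiles = {p: linear_quantile(udi, p) for p in (0.05, 0.25, 0.5, 0.75, 0.95)}

for p, q in quantiles.items():
    print(f"p = {p:.2f}: {q:.2f}")

iqr = quantiles[0.75] - quantiles[0.25]
print(f"IQR: {iqr:.1f}")
```

The same interpolation rule is consistent with the non-integer percentiles reported for other variables, such as 1868.05 for Rotational speed [rpm] and 9.95 for Tool wear [min].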

Descriptive Statistics

Mean 5000.5
Standard Deviation 2886.8957
Variance 8.3342e+06
Sum 5.0005e+07
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.5773
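
The descriptive statistics for UDI can be reproduced by hand, which also shows why the variable is flagged as uniform: a run of consecutive integers has zero skewness and an excess kurtosis close to -1.2. The sketch below assumes UDI is exactly 1..10000, that the standard deviation uses the n - 1 denominator, and that skewness and kurtosis are built from population moments; none of these conventions is stated in the report itself.

```python
import statistics

# Assumption: UDI is the sequence 1..10000.
udi = list(range(1, 10001))
n = len(udi)

mean = statistics.fmean(udi)
std = statistics.stdev(udi)          # sample standard deviation (n - 1)
variance = statistics.variance(udi)  # sample variance (n - 1)

# Population central moments, used here for skewness and excess kurtosis.
m2 = sum((x - mean) ** 2 for x in udi) / n
m3 = sum((x - mean) ** 3 for x in udi) / n
m4 = sum((x - mean) ** 4 for x in udi) / n

skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3
coefficient_of_variation = std / mean

print(f"mean={mean}, std={std:.4f}, variance={variance:.4e}")
print(f"skewness={skewness:.4f}, kurtosis={excess_kurtosis:.4f}")
print(f"cv={coefficient_of_variation:.4f}")
```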

Product ID

categorical

Approximate Distinct Count 10000
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory Size 710000

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row M14860
2nd row L47181
3rd row L47182
4th row L47183
5th row L47184

Letter

Count 10000
Lowercase Letter 0
Space Separator 0
Uppercase Letter 10000
Dash Punctuation 0
Decimal Number 50000
  • Product ID contains many words: 10000 words
  • Product ID has words of constant length
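
The labels in the Letter table (Uppercase Letter, Decimal Number, and so on) match Unicode general category names. As a rough illustration, the counts can be approximated with Python's `unicodedata` module on the five sample IDs shown above; whether the profiling tool counts characters this way is an assumption, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

# The five sample Product IDs listed in the report.
sample_ids = ["M14860", "L47181", "L47182", "L47183", "L47184"]


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(sample_ids)
lengths = {len(value) for value in sample_ids}

print("Uppercase Letter (Lu):", counts["Lu"])
print("Decimal Number (Nd):", counts["Nd"])
print("Distinct lengths:", lengths)
```

Each sample ID is one uppercase letter followed by five digits. If every ID follows that pattern, 10000 rows give the 10000 uppercase letters and 50000 decimal digits reported above.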

Type

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (L) occurs over 2.0 times as often as the second most frequent value (M)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row M
2nd row L
3rd row L
4th row L
5th row L

Letter

Count 10000
Lowercase Letter 0
Space Separator 0
Uppercase Letter 10000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (L, M) take over 50.0%
  • Type has words of constant length

Air temperature [C]

numerical

Approximate Distinct Count 93
Approximate Unique (%) 0.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 26.8549
Minimum 22.15
Maximum 31.35
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Air temperature [C] is skewed right (γ1 = 0.1143)

Quantile Statistics

Minimum 22.15
5-th Percentile 23.95
Q1 25.15
Median 26.95
Q3 28.35
95-th Percentile 30.35
Maximum 31.35
Range 9.2
IQR 3.2

Descriptive Statistics

Mean 26.8549
Standard Deviation 2.0003
Variance 4.001
Sum 268549.3
Skewness 0.1143
Kurtosis -0.8361
Coefficient of Variation 0.07448
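
Several of the figures above are derived from others: Range is Maximum minus Minimum, IQR is Q3 minus Q1, Variance is the squared Standard Deviation, and Coefficient of Variation is Standard Deviation divided by Mean. A short consistency check on the rounded Air temperature [C] values follows. It is only a sketch: small differences in the last digit are expected because the inputs are already rounded.

```python
# Rounded values copied from the Air temperature [C] section of the report.
minimum, maximum = 22.15, 31.35
q1, q3 = 25.15, 28.35
mean, std = 26.8549, 2.0003

value_range = maximum - minimum          # reported: 9.2
iqr = q3 - q1                            # reported: 3.2
variance = std ** 2                      # reported: 4.001
coefficient_of_variation = std / mean    # reported: 0.07448

print(f"range={value_range:.2f}, iqr={iqr:.2f}")
print(f"variance={variance:.4f}, cv={coefficient_of_variation:.5f}")
```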

Process temperature [C]

numerical

Approximate Distinct Count 82
Approximate Unique (%) 0.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 36.8556
Minimum 32.55
Maximum 40.65
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Process temperature [C] is skewed right (γ1 = 0.015)

Quantile Statistics

Minimum 32.55
5-th Percentile 34.55
Q1 35.65
Median 36.95
Q3 37.95
95-th Percentile 39.35
Maximum 40.65
Range 8.1
IQR 2.3

Descriptive Statistics

Mean 36.8556
Standard Deviation 1.4837
Variance 2.2015
Sum 368555.6
Skewness 0.01503
Kurtosis -0.5001
Coefficient of Variation 0.04026

Rotational speed [rpm]

numerical

Approximate Distinct Count 941
Approximate Unique (%) 9.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 1538.7761
Minimum 1168
Maximum 2886
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Rotational speed [rpm] is skewed right (γ1 = 1.9929)

Quantile Statistics

Minimum 1168
5-th Percentile 1332
Q1 1423
Median 1503
Q3 1612
95-th Percentile 1868.05
Maximum 2886
Range 1718
IQR 189

Descriptive Statistics

Mean 1538.7761
Standard Deviation 179.2841
Variance 32142.787
Sum 1.5388e+07
Skewness 1.9929
Kurtosis 7.3886
Coefficient of Variation 0.1165
  • Rotational speed [rpm] is not normally distributed (p-value 3.2186e-05)
  • Rotational speed [rpm] has 418 outliers

Torque [Nm]

numerical

Approximate Distinct Count 577
Approximate Unique (%) 5.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 39.9869
Minimum 3.8
Maximum 76.6
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Torque [Nm] is skewed left (γ1 = -0.0095)

Quantile Statistics

Minimum 3.8
5-th Percentile 23.5
Q1 33.2
Median 40.1
Q3 46.8
95-th Percentile 56.1
Maximum 76.6
Range 72.8
IQR 13.6

Descriptive Statistics

Mean 39.9869
Standard Deviation 9.9689
Variance 99.3796
Sum 399869.1
Skewness -0.009515
Kurtosis -0.01383
Coefficient of Variation 0.2493
  • Torque [Nm] is not normally distributed (p-value 0.001195)
  • Torque [Nm] has 69 outliers
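
The report does not say how outliers are counted. A common convention, assumed here, is the 1.5 × IQR rule, which flags values below Q1 - 1.5·IQR or above Q3 + 1.5·IQR. The sketch below computes those fences from the reported quartiles for Rotational speed [rpm] and Torque [Nm], then checks whether the reported minimum and maximum fall outside them. `iqr_fences` is a hypothetical helper.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles, minimum and maximum copied from the report.
variables = {
    "Rotational speed [rpm]": {"q1": 1423, "q3": 1612, "min": 1168, "max": 2886},
    "Torque [Nm]": {"q1": 33.2, "q3": 46.8, "min": 3.8, "max": 76.6},
}

fences = {}
for name, s in variables.items():
    lower, upper = iqr_fences(s["q1"], s["q3"])
    fences[name] = (lower, upper)
    print(f"{name}: fences = ({lower:.1f}, {upper:.1f}), "
          f"low outliers possible: {s['min'] < lower}, "
          f"high outliers possible: {s['max'] > upper}")
```

Under this assumption, Rotational speed [rpm] can only have high outliers, since its minimum of 1168 lies above the lower fence, while Torque [Nm] can have outliers on both sides. The same rule places the fences for Air temperature, Process temperature and Tool wear outside their observed ranges, which is consistent with no outlier insight being reported for them.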

Tool wear [min]

numerical

Approximate Distinct Count 246
Approximate Unique (%) 2.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 160000
Mean 107.951
Minimum 0
Maximum 253
Zeros 120
Zeros (%) 1.2%
Negatives 0
Negatives (%) 0.0%
  • Tool wear [min] is skewed right (γ1 = 0.0273)

Quantile Statistics

Minimum 0
5-th Percentile 9.95
Q1 53
Median 108
Q3 162
95-th Percentile 206.05
Maximum 253
Range 253
IQR 109

Descriptive Statistics

Mean 107.951
Standard Deviation 63.6541
Variance 4051.8504
Sum 1.0795e+06
Skewness 0.02729
Kurtosis -1.1668
Coefficient of Variation 0.5897
  • Tool wear [min] is not normally distributed (p-value 9.6743e-06)

Machine failure

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 28.5 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • Machine failure has words of constant length

TWF

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 216.39 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • TWF has words of constant length

HDF

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 85.96 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • HDF has words of constant length

PWF

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 104.26 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • PWF has words of constant length

OSF

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 101.04 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • OSF has words of constant length

RNF

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 660000
  • The most frequent value (0) occurs over 525.32 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 10000
  • The top 2 categories (0, 1) take over 50.0%
  • RNF has words of constant length
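
The imbalance ratios reported for Machine failure, TWF, HDF, PWF, OSF and RNF can be turned back into approximate class counts. Each of these columns has two distinct values and no missing cells, so with 10000 rows and a ratio r of zeros to ones, the number of ones is about 10000 / (1 + r). The sketch below applies that arithmetic. The ratios are rounded in the report, so the resulting counts are estimates, not values read from the data.

```python
N_ROWS = 10000  # "Number of Rows" from the dataset statistics

# Ratio of the most frequent value (0) to the second most frequent value (1),
# copied from the per-variable insights in the report.
ratios = {
    "Machine failure": 28.5,
    "TWF": 216.39,
    "HDF": 85.96,
    "PWF": 104.26,
    "OSF": 101.04,
    "RNF": 525.32,
}


def minority_count(ratio, n_rows=N_ROWS):
    """Estimate the count of the rarer class from a majority/minority ratio."""
    return round(n_rows / (1 + ratio))


estimated_ones = {name: minority_count(r) for name, r in ratios.items()}

for name, ones in estimated_ones.items():
    print(f"{name}: ~{ones} ones ({100 * ones / N_ROWS:.2f}% of rows)")
```

On these estimates, roughly 3.4% of rows are machine failures and every individual failure mode is rarer still, so accuracy alone would be a weak metric for any classifier trained on these labels.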

Interactions

Correlations

Missing Values